1. Summary of Accomplishments


The goal of this Phase I SBIR program is to demonstrate feasibility of a 3-dimensional
RAM based on superconductive digital electronics. The project goal has been
successfully achieved. Specifically, HYPRES performed the following tasks:

1.  A  comprehensive  study  of  possible  high-performance  superconductive  RAM 
architectures  has  been  conducted.  The  result  of  this  study  is  a  novel  block-access 
RAM architecture with an access time of about 300 ps.

2.  Several  types  of  RAM  cell  have  been  evaluated  by  simulation.  Two  different  types  of 
these  RAM  cells  have  been  experimentally  evaluated. 

3. As a result of this study, a novel RAM cell with non-destructive readout has also
been designed and evaluated. It exploits three shunted Josephson junctions and has
good parameter margins (more than ±32%). The size of the RAM cell (50x45 µm²
for HYPRES' 3.5 µm fabrication process) allows us to place a 16 Kbit RAM matrix
on a 1x1 cm² chip.

4.  A  new  control  current  driver  has  been  designed,  fabricated,  and  successfully  tested. 
The current driver works with ±9% DC bias current margins.

5.  In  order  to  assess  the  feasibility  of  a  3D  memory  module,  the  interconnection  between 
stacked  chips  has  been  designed  and  fabricated.  A  chip-to-chip  pulse  transfer 
experiment has been carried out. We have successfully observed SFQ pulse
propagation  through  laser  drilled  holes  in  a  chip  at  low  frequency. 

6.  Exceeding  the  Phase  I  objectives,  we  have  also  developed  a  novel  random  access 
memory  concept  called  Deep  Pipeline  RAM  (DPRAM).  This  RAM  is  a  full  pipeline 
structure  and  has  an  extremely  low  cycle  time  (below  30  ps). 

2. Description of Work Performed

1.1 Development of a RAM architecture

The first objective of Phase I was to refine the proposed RAM architecture. We have
conducted an extensive study to find the best way to build a RAM of at least 16 Kbit with
sub-nanosecond access time, suitable for 3D packaging. As a result, a novel block-access
RAM architecture has been developed (Fig. 1).

Fig. 1. Block diagram of 16 Kbit RAM chip, using a novel block-access architecture.

The block-access, non-destructive-readout superconductive RAM is shown in Fig. 1. The
basic feature of this RAM is its row access: instead of individual bits, the proposed
RAM stores 32-bit words. The 16 Kbit, 1x1 cm² RAM chip consists of 4 blocks.
Each block has a 128x32 RAM matrix with row access. The incoming data and
addresses go through the Y demultiplexer to the corresponding block, according to the first
two bits of the address. After that, the rest of the address (7 bits) goes to a row decoder and
selects the corresponding row. The W/R bit sets the direction of current in the X select
line. The presence of 32-bit data indicates a WRITE operation; the data goes in parallel
through the Y-select lines, writing a word to the selected row. The address decoder and
demultiplexer are based on Rapid Single-Flux-Quantum (RSFQ) logic.
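
The addressing scheme described above can be illustrated with a short sketch. It assumes that the "first two bits" are the most significant bits of a 9-bit word address; the report does not state the bit order, and the names `split_address` and `capacity_bits` are hypothetical.

```python
from typing import Tuple

BLOCK_BITS = 2   # selects one of the 4 blocks (Y demultiplexer)
ROW_BITS = 7     # selects one of the 128 rows inside a block (row decoder)
WORD_BITS = 32   # each row stores one 32-bit word


def split_address(addr: int) -> Tuple[int, int]:
    """Split a 9-bit word address into (block, row).

    Assumption: the block-select bits are the two most significant bits.
    """
    if not 0 <= addr < (1 << (BLOCK_BITS + ROW_BITS)):
        raise ValueError("address must fit in 9 bits")
    block = addr >> ROW_BITS
    row = addr & ((1 << ROW_BITS) - 1)
    return block, row


def capacity_bits() -> int:
    """Total capacity: blocks x rows per block x bits per word."""
    return (1 << BLOCK_BITS) * (1 << ROW_BITS) * WORD_BITS
```

For example, `split_address(0b100000101)` selects block 2, row 5, and `capacity_bits()` gives 16384 bits, which matches the 16 Kbit figure quoted for the chip.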

The speed of signal propagation along a superconductive microstrip line with an
SiO2 insulator is 0.3 c, where c is the speed of light. This gives a signal propagation delay
of about 50 ps. The estimated access time of this RAM (including the delays in the
decoders) is 300 ps.
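
The report does not state the line length behind the 50 ps figure. As a rough check, the sketch below converts between line length and delay at 0.3 c; the function names are hypothetical.

```python
C_M_PER_S = 299_792_458.0   # speed of light in vacuum
VELOCITY_FACTOR = 0.3       # propagation speed on the microstrip line, from the text


def delay_ps(length_m: float) -> float:
    """Propagation delay in picoseconds for a line of the given length."""
    return length_m / (VELOCITY_FACTOR * C_M_PER_S) * 1e12


def length_mm(delay_in_ps: float) -> float:
    """Line length in millimetres that produces the given delay."""
    return delay_in_ps * 1e-12 * VELOCITY_FACTOR * C_M_PER_S * 1e3
```

Under these numbers a 50 ps delay corresponds to a line of roughly 4.5 mm, and a line spanning the full 1 cm chip edge would take about 111 ps.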

The  advantages  of  this  architecture  are: 

•  The row access simplifies the RAM cell, reducing its size.

•  The  RSFQ  decoders  do  not  need  to  be  driven  by  AC  currents,  thus  reducing  power 
dissipation  and  crosstalk,  allowing  the  opportunity  for  3D  packaging. 

•  The signal propagates along impedance-matched microstrip lines at a substantial
fraction of the speed of light (about 0.3 c), reducing latencies and access times.

•  The  non-destructive  readout  and  non-volatile  cell  characteristics  eliminate  the  need 
for  a  REFRESH  operation,  which  dynamic  RAMs  require,  reducing  the  cycle  time. 

•  The block access allows us to transfer data into and out of the chip in parallel,
increasing the efficiency of the lower-performance chip-to-chip interface.

1.2  Evaluation  of  RAM  cells 

We have evaluated, both theoretically and experimentally, the version of the RSFQ RAM
cell suggested in the Phase I proposal (Fig. 2). We encountered some effects that
cannot be taken into account in simulation.


Fig. 2. Schematic of the memory cell suggested in the Phase I proposal.

Specifically, testing of a 2x5 RAM cell array (Fig. 3) indicated poor DC
bias current margins, resulting in a poor yield of cells. Only four of the ten cells were
fully functional, and only within very narrow DC bias current margins.


Fig.  3.  Layout  of  a  2x5  memory  cell  array  with  floating  grounds  and  common  bias  lines. 


As  a  result,  we  have  made  the  following  conclusions: 

•  The  floating  ground  approach  complicates  design  and  reduces  parameter  margins. 

•  Propagation of SFQ pulses through the Josephson transmission line (JTL) is not
efficient. The delay of pulse propagation cannot be compensated by microstrip
capacitance and is unacceptably large.

•  A non-destructive readout and non-volatile RAM cell would eliminate the need for
a REFRESH operation, reducing the cycle time.


1.3  A  novel  non-destructive  readout  RAM  cell 

After careful investigation, we have developed a new RAM concept (Fig. 1). A proposed
cell for this architecture is shown in Fig. 4.

We have designed, fabricated, and experimentally evaluated a new version of the
superconductive RAM cell. This NDRO cell consists of three shunted Josephson
junctions comprising two interferometers with a common inductance (Fig. 4). The
single-junction interferometer serves as a storage loop, while the two-junction
interferometer serves as a readout SQUID.


Fig. 4. Schematic of the new NDRO memory cell.

The row-access architecture allows us to eliminate one select line, reducing the
space occupied by the cell while also solving the half-select problem. The truth table for
the RAM cell operation is as follows. The sign before the select line name indicates
the control current direction.


Table 1. Truth Table for NDRO RAM Design

Operation    Select lines    Access
WRITE 1      +X+Y            Each cell
WRITE 0      -X              Entire row
READ         +X              Entire row
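
Table 1 can be read as a small behavioural model. The sketch below is a toy illustration only, not a circuit simulation. It assumes that a word is written in two steps (-X clears the row, then +X with +Y on the columns holding a 1), which is one possible reading of the table and is not stated in the report. The class `NdroRow` and its methods are hypothetical.

```python
from typing import List, Optional


class NdroRow:
    """Toy behavioural model of one row of NDRO cells."""

    def __init__(self, width: int = 32) -> None:
        self.cells: List[int] = [0] * width

    def drive(self, x: int, y: Optional[List[int]] = None) -> None:
        """Apply one select-line pattern: x is +1 or -1, y[i] is 1 where +Y flows."""
        if x not in (+1, -1):
            raise ValueError("x must be +1 or -1")
        if y is None:
            y = [0] * len(self.cells)
        if len(y) != len(self.cells):
            raise ValueError("one Y value per column is required")
        if x == -1:
            # WRITE 0: -X clears the entire row.
            self.cells = [0] * len(self.cells)
        else:
            # WRITE 1: a cell is set only where +X and +Y coincide.
            # +X alone (READ) leaves the stored bits unchanged.
            self.cells = [1 if yi else c for c, yi in zip(self.cells, y)]

    def write_word(self, word: List[int]) -> None:
        """Assumed two-step write: clear the row, then set the 1 bits."""
        self.drive(-1)
        self.drive(+1, word)

    def read(self) -> List[int]:
        """READ: +X on the row; the stored word is returned without being altered."""
        self.drive(+1)
        return list(self.cells)
```

In this model a repeated `read()` returns the same word each time, which is the non-destructive readout property; a cell that sees only +X (no +Y) keeps its state, which is how the model reflects the absence of a half-select problem.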

All cells in a column are connected in series. The DC current flows through the
readout SQUIDs. A sensing device is placed at the end of each column. If a SQUID
switches to the resistive state during the READ operation, the sensing device detects the
signal and transforms it into an SFQ pulse.

Experimental evaluation shows excellent margins for this cell. The minimal critical
current margin is 32%. The minimal control current amplitude margin is above 50%.
The DC bias current of the readout SQUIDs has 36% margins. Moreover, the simplicity and
reliability of this cell make it well suited to large-scale-integration memories.

1.4 The dc Current Driver

We have designed, fabricated, and successfully tested an amplifier based on the current
driver proposed in this project. Fig. 5 shows a schematic of this circuit. A large
inductance loop connects two pairs of unshunted Josephson junctions driven by
relaxation oscillations. The dc current from the current source is pushed into and out
of the inductance loop, which is magnetically coupled to a chain of dc SQUIDs. Fig. 6 shows
successful test results of the amplifier based on DC current drivers.




Fig.  5.  Output  amplifier  based  on  DC  current  drivers. 


Fig. 6. Low-frequency test results of the output amplifier (scale: 1 mV/div.). The output
voltage amplitude is 3 mV, in contrast to the custom monitor (0.2 mV).

1.5 Chip-to-chip data transfer

We  have  proposed,  designed  and  tested  a  novel  method  to  implement  chip-to-chip 
interconnection,  using  holes  made  by  a  laser-drilling  technique.  Fig.  7  shows  the  idea 
proposed  in  Phase  I  to  experiment  with  chip-to-chip  SFQ  pulse  propagation. 



Fig.  7.  The  Phase  I  experiment  on  chip-to-chip  data  transfer. 


To provide interconnections with uniform impedance, we proposed to drill holes of about
0.2 mm in diameter in the vertically stacked chips. Fig. 8 shows microphotographs of
chips with holes made by Laser Light Technology, Inc. (Missouri). Fig. 8a shows a
microphotograph of a hole made through the metal layer. Because the laser radiation
reflects from the metal surface, the edges of the hole have numerous defects. As a result,
the contact pad is damaged and cannot be soldered to a wire. We have found that, for
better quality, all metal layers should be removed from the area assigned for the hole
during the fabrication process. We have designed special contact pads with metal-free
targets for the laser drilling. Fig. 8b shows the result of laser drilling through the silicon
substrate without any metal layer. In the center of a contact pad one can see the round
targets, where all metal layers were removed.


Fig. 8. Microphotographs of 0.2 mm holes laser-drilled in a chip when (a) the holes
were drilled through the metal layers and (b) the metal layers were removed
beforehand.

Fig.  9.  Block-diagram  of  the  chip-to-chip  data  transfer  experiment. 


The block diagram of our chip-to-chip data transfer experiment is shown in Fig. 9.
We have designed a transmitter capable of amplifying an SFQ pulse to a level sufficient
for the pulse to pass through the chip-to-chip interconnection and be sensed by a
sensitive receiver on another chip.


Fig. 10. The front (a) and the bottom (b) side of a chip with a soldered wire connecting
two contact pads through the 200 µm holes.

Fig. 10 shows microphotographs of a HYPRES chip with laser-drilled holes connected to
a gold wire. We have soldered a 50 µm wire to the contact pads. In Fig. 10(a) one can
see the indium bumps. In the bottom view of the chip (Fig. 10(b)) one can see a 50 µm
diameter gold wire connecting the receiver and transmitter pads. The resistance of this wire
was 1.2 Ω at room temperature.

Fig. 11 shows low-frequency testing results. The most critical margins, those of the
receiver's dc bias current, were ±9%. The estimated maximum operational frequency of
this interconnection is about 3 GHz.

1.6 Deep Pipeline Random Access Memory

In  order  to  reduce  memory  cycle  time,  we  can  use  a  pipelined  approach.  This  pipeline 
structure  requires  extra  circuitry,  which  occupies  more  space  on  the  chip,  but 
dramatically  reduces  the  cycle  time.  The  proposed  Deep-Pipeline  RAM  (DPRAM) 
operates  in  a  block  access  mode.  It  contains  a  fully  pipelined  address  decoder  and  a 
memory  matrix  consisting  of  pipelined  rows.  The  number  of  rows  corresponds  to  the 
size of the block (Fig. 12). To achieve a 30 ps cycle time for a 128x32 memory array, the
matrix block is divided into four pipeline stages. Each stage is connected to the next
through timed repeaters.


Fig.  12.  Block-diagram  of  DPRAM. 

In this RAM, data moves along the rows synchronously with the address in the decoder. When
an address reaches the designated column, the decoder sends a signal to the column,
which reads the word stored in the memory cells of this column into the repeater. Thus, we
have a fully pipelined RAM with a cycle time equal to one clock period. The access time
is a number of clock periods corresponding to the number of pipeline stages. Therefore,
by reducing the RAM capacity to 1 Kbit per 1 cm² chip, we can reduce the cycle time to 30 ps.
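
The relation between cycle time and access time in a pipeline can be made concrete with a short sketch. It counts only the four matrix stages mentioned in the text; any stages in the pipelined address decoder are not quantified in the report and would add to the access time. The function name is hypothetical.

```python
from typing import Tuple


def pipeline_timing(clock_period_ps: float, stages: int) -> Tuple[float, float, float]:
    """Return (cycle time in ps, access time in ps, words per nanosecond).

    Cycle time equals one clock period; access time is one clock period
    per pipeline stage, as described in the text.
    """
    if clock_period_ps <= 0 or stages < 1:
        raise ValueError("clock period must be positive and stages >= 1")
    cycle_ps = clock_period_ps
    access_ps = stages * clock_period_ps
    words_per_ns = 1000.0 / cycle_ps
    return cycle_ps, access_ps, words_per_ns
```

With a 30 ps clock and four stages this gives a 30 ps cycle time and a 120 ps access time: a new word can be delivered every clock period even though each individual access takes several periods.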

3. Conclusion

We have successfully demonstrated the feasibility of a superconductive, sub-nanosecond
access time, three-dimensional random-access memory. We have developed a schematic
of the 16 Kbit RAM chip with 300 ps access time and a schematic of the 1 Kbit Deep-
Pipeline RAM with a cycle time below 30 ps. The components for the proposed RAM
were designed, fabricated and successfully tested. We have designed a novel NDRO
memory cell. Due to the novel architectural solution, this cell uses only three
Josephson junctions and has a size as small as 50 µm x 54 µm. We have successfully
demonstrated data propagation between two vertically stacked chips using the
laser-drilling technique.

The achieved results provide a solid foundation for rapid progress in Phase II of the
project. HYPRES will upgrade its Nb fabrication process to 1.5 µm minimal feature size
lithography. After that, we will be able to increase the integration scale of the proposed
RAM to 64 Kbit per 1 cm² chip.
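
As a rough plausibility check of the 64 Kbit target, the sketch below assumes that cell area scales with the square of the minimal feature size and that the same fraction of the chip is used for the memory matrix. Both are idealised assumptions, not statements from the report, and the function names are hypothetical.

```python
def area_scale(old_feature_um: float, new_feature_um: float) -> float:
    """Ideal density gain when the linear feature size shrinks."""
    if old_feature_um <= 0 or new_feature_um <= 0:
        raise ValueError("feature sizes must be positive")
    return (old_feature_um / new_feature_um) ** 2


def projected_capacity_kbit(capacity_kbit: float,
                            old_feature_um: float,
                            new_feature_um: float) -> float:
    """Upper-bound capacity on the same chip area under ideal scaling."""
    return capacity_kbit * area_scale(old_feature_um, new_feature_um)
```

Going from 3.5 µm to 1.5 µm gives an ideal density gain of about 5.4, so the 16 Kbit design would scale to at most about 87 Kbit; the stated 64 Kbit target (a factor of 4) lies within that bound.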

The  proposed  RAM  can  provide  a  substantial  performance  improvement  over  existing 
and  prospective  semiconductor  systems,  sufficient  to  yield  successful  commercialization. 